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ABSTRACT 

Clusters of galaxies are important laboratories for understanding both galaxy evolu- 
tion and constraining cosmological quantities. Any analysis of clusters, however, is best 
done when one can reliably determine which galaxies are members of the cluster. While 
this would ideally be done spectroscopically, the difficulty in acquiring a complete sam- 
ple of spectroscopic redshifts becomes rather daunting, especially at high redshift where 
the background contamination becomes increasingly larger. Traditionally, an alterna- 
tive approach of applying a statistical background correction has been utilized, which, 
while useful in a global sense, does not provide information for specific galaxies. In 
this paper, we develop a more robust technique which uses photometrically estimated 
redshifts to determine cluster membership. This technique can either be used as an 
improvement over the commonly used statistical correction method or it can be used to 
determine cluster candidates on an individual galaxy basis. By tuning the parameters of 
our algorithm, we can selectively maximize our completeness or, alternatively, minimize 
our contamination. Furthermore, our technique provides a statistical quantification of 
both our resulting completeness and contamination from foreground and background 
galaxies. 

Subject headings: galaxies: distances and redshifts and clusters: general; cosmology: 
observations 

1. Introduction 

In the wake of the Hubble deep field campaigns (Williams et al. 1996), and with the growth 
of large photometric surveys [e.g., DPOSS, SDSS, etc.), the techniques involved in calculating 
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photometric redshifts have become increasingly more sophisticated and precise (see Weymann et al. 
1999, for a recent synopsis of the field). Accuracies of = 0.05 can be routinely achieved (Brunner 
et al. 1999). Therefore, it is now possible to use photometric redshifts to answer specific scientific 
questions (e.g., Connolly et al. 1998; Brunner et al. 2000). In light of this, we have begun a 
program to use accurate photometric redshifts to study the galaxy populations in high-redshift 
clusters of galaxies. For this study, we have examined the most well-studied sample of optically- 
selected, high-redshift clusters to date that of Oke, Postman & Lubin (1998). Over the past few 
years, Oke et al. (1998) have undertaken a detailed photometric, spectroscopic, and morphological 
survey of nine candidate clusters at z > 0.6. The survey consists of deep BVRIK' photometry, 
over 130 low-resolution Keck spectra per cluster field, and high-spatial-resolution WFPC2 imagery 
from the Hubble Space Telescope (HST). 

Previous work on similar galaxy cluster samples at various redshifts has relied on statistical 
corrections in order to remove contamination due to field galaxies when spectroscopic identifications 
are not available for all galaxies in the cluster field {e.g., Aragon-Salamanca et al. 1993; Lubin 
1996; Small et al. 1997; Dressier et al. 1997; Couch et al. 1998; Stanford et al. 1998; Lubin et al. 
1998). These types of corrections, however, are fraught with potential systematic effects, including 
cosmological variance, weak lensing, and sample selection criteria. While these effects will only 
serve to increase the noise of any measurement made over a statistically complete sample, they 
will systematically affect any measurements made on an individual cluster {e.g., the Morphological 
fraction or the Morphology-Density relationship), introducing a bias into the actual measurement 
process. Furthermore, these line-of-sight contamination corrections apply to the entire galaxy 
sample and do not provide any information for specific galaxies. As a result, we have decided to 
explore the role photometric redshifts could play in improving the reliability of identifying galaxy 
cluster members (Kodama et al. 1999; Lubin & Brunner 1999). 

In this paper, we first describe in Section 2 the details of the data observations, reduction, 
and catalog generation, emphasizing relevant points which are important for the estimation of 
photometric redshifts. In Section 3 we describe the details of our redshift estimation technique. 
Next, we detail the cluster galaxy identification procedure in Section 4. We conclude with a 
discussion of our results and future applications. 

2. Data 

For the photometric redshift analysis, we have used the photometric and spectroscopic data 
from the original cluster survey of Oke et al. (1998). This survey consists of nine candidate clusters 
at redshifts of z > 0.6 : CI 0023-^0423, CI 0231-^0048, CI 0943-^4804, CI 1324-^3011, CI 1325-^3009, 
CI 1604-^4304, CI 1604-^4321, CI 1607+4109, and CI 2157-^0347. Oke et al. (1998) have determined 
that six of the nine candidate clusters are indeed real density enhancements, consistent with that of 
a cluster of galaxies. The observational data relevant for the work described in this paper consists 
of B, V, R, I, and K' band imaging and spectroscopic observations of selected galaxies in each 
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field. 

2.1. Observations and Data Reduction 

All of the ground-based optical observations, both broad band and spectroscopic, were taken 
with the Low Resolution Imaging Spectrometer (LRIS; Oke et al. 1995) on either the Keck I or the 
Keck II 10-m telescopes. The infrared imaging was performed with the Infrared Imaging Camera 
(IRIM) on the Mayall 4-m telescope at Kitt Peak National Observatory. In this section, we briefly 
describe the observations and the data reduction; however, the reader is referred to Oke et al. 
(1998) for a complete account of these observations. 

2.1.1. Broad-Band Optical 

The photometric survey was conducted in four broad band filters, BVRI, which closely match 
the Cousins system. The response curves of these filters are shown in Figure 1 of Oke et al. (1998). 
In imaging mode, LRIS covers a ficld-of-vicw of 6 x 8 arcmin. The total exposure times in each 
filter are 3600, 2000, 1200, and 900 seconds for B, V, R, and /, respectively, and were chosen 
to give fairly uniform errors in the photometry. The B, V, and R observations consisted of two 
equal exposures to allow for accurate cosmic ray rejection. The I exposure time was broken into 
3 equal exposures in order to cope with the cosmic rays, as well as to avoid approaching the CCD 
saturation level. 

The LRIS imaging data were reduced in the standard fashion. All frames were bias-subtracted, 
and pixel-to-pixel sensitivity variations were removed using dome flats. Large-scale gradients were 
removed by dividing each frame by a normalized two-dimensional spline fit to the sky values in a 
sky- fiat. The sky- flat was created by generating a median image from a stack of frames for each 
night and passband. For the / band data, a two-dimensional fringe map was also created from the 
median filtered image by removing large-scale gradients within the median image. Fringing was 
removed from each / band frame by subtracting a suitably scaled version of the fringe map. 

Image registration for a given cluster field was performed by identifying approximately 10 
unsaturated stars (detectable in all 4 passbands) to be used as astrometric reference points. The 
mean X and Y offsets of these stars in every frame taken of the cluster were computed relative to 
their locations in a fiducial B band image. All image data for the cluster were shifted to match 
the B band coordinate system using a flux conserving Lagrangian interpolation scheme to achieve 
registration at the sub-pixel level. Once all frames of a given cluster were registered to a common 
coordinate system, the independent exposures in each passband were co-added to produce the final 
four BVRI images. 

The Keck imaging has been calibrated to the standard Cousins-Bessell-Landolt (Cape) system 
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through exposures of a number of Landolt standard star fields (Landolt 1992). For a circular 
aperture with a radius of 3", the approximate limiting magnitudes are B = 25.1, V = 24.1, 
R = 23.5, and I = 21.7 for a 5-a detection. 

2.1.2. Spectroscopic 

Multi-slit observations of the cluster field were made with LRIS in spectroscopic mode using a 
300 g mm^^ grating blazed at 5000 A. The chosen grating provided a dispersion of 2.35 A per pixel 
and a spectral coverage of 5100 A. The grating angle was set in order to provide coverage from 
approximately 4400 A to 9500 A in the first order. A GG495 glass filter was used to eliminate the 
overlapping second order spectrum; there is, therefore, no second order contamination below 9700 
A. In order to obtain the full wavelength range along the dispersion axis, the field-of-view of the 
spectral observations was reduced from that of the imaging mode to approximately 2x8 arcmin. 

Spectroscopic candidates were chosen from preliminary R band imaging of each cluster field. 
All objects brighter than approximately R = 23.5 it 0.1 within the spectroscopic field-of-view of 
LRIS were included as candidates for slit-mask spectra. For each cluster field, six different slit- 
masks were made with approximately 30 objects per mask (including guide stars and duplicate 
observations). The exposure time for each mask was 1 hour. Flat-fielding and wavelength calibra- 
tion were performed using internal fiat-field and arc lamp exposures which were taken after each 
science exposure. 

In practice, about 130 spectra were acquired per cluster field, with ~ 100 yielding measurable 
redshifts. A redshift measurement procedure which relies, in part, on visual inspection was used. 
The quality of the redshift identification was ranked with a number from 1 to 4, which roughly 
corresponds to the number of features used to identify the redshift. A quality of 4 means that the 
redshift is certain; a quality of 1 means that only one emission line was observed, and the redshift is 
only possible. Because the fore/background contamination rate is so large at our cluster redshifts, 
approximately 50 - 85% of the galaxies turn out to be field galaxies, rather than cluster members 
(Oke et al. 1998; Postman et al. 1998, 2000). 

2.1.3. Infrared 

Deep infrared imaging of the cluster sample was taken with IRIM which contains a 256 x 256 
NICM0S3 HgCdTe array and has a resolution of 0.60 arcsecond per pixel on the 4-m telescope. 
The field-of-view (154 x 154 arcseconds) covers the central region of each cluster. In order to make 
the deepest possible observations over this region the entire LRIS field-of-view was not mosaiced. 
Observations were made in the 2.2/x K' band. Each central field was observed using a 4 x 4 dither 
pattern with a stepsize of lO" and a total extent of 30". Each exposure had an effective exposure 
time of 1 minute, with 4 co-additions of individual, background-limited 15 second integrations. 
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The total integration time on an individual cluster varied between 3 and 4.4 hours. Because the 
fields were not excessively crowded, we were able to use in-field dithering to create a global sky flat. 

The K' cluster data were reduced using the Deep Infrared Mosaicing Software (DIMSUM), a 
publicly available package of IRAF scripts. This software generates not only a final stacked image 
but also a corresponding exposure image, where each pixel encodes the total number of seconds 
in the corresponding stacked image. This exposure image was appropriately scaled and used as a 
weight image by SExtractor during the catalog generation (see §2.2). 

The data were linearized, trimmed to exclude masked columns and rows on the edges of the 
arrays, and dark-subtracted using dark frames of the same exposure length as the observations. All 
images of a given night were flattened by a super fiat made from a series of dome flats taken during 
the previous day. As part of the DIMSUM procedure, sky subtraction was done by subtracting a 
scaled median of nine temporally adjacent exposures for each frame. A first-pass reduction was 
used to create an object mask for each frame. This mask was created from a fully stacked mosaic 
image. It therefore excluded not only the bright objects, but also those objects too faint to be 
detected in an individual exposure. In the second pass, the object mask was used to avoid object 
contamination of the sky fiat in the production of sky frames. Final mosaicing of the images of 
each cluster were made with a replication of each pixel by a factor of 4 in both dimensions. This 
procedure conserves flux while eliminating the need for interpolation when the individual frames 
are co-aligned. A bad pixel mask was used to exclude bad pixels from the final summed images. 

Absolute photometric transformations were derived from the observations of Persson et al. 
(1998) HST standard stars. Each standard star was observed every night in five separate array 
positions. The observations of the standard stars were reduced in a similar manner to the cluster 
data (see Oke et al. 1998). An approximate limiting magnitude of K' = 20 for a ha detection was 
reached in a standard aperture of radius S'.'O. 

2.2. Catalog Generation 

We used SExtractor version 2.1.5 (Bertin &; Arnouts 1996) to perform the source detection 
and photometry because this program is able to detect objects in one image and analyze the 
corresponding pixels in a separate image. Applied uniformly to multi-band image data (i.e. use the 
same detection image for all measurement images), this method will produce a matched aperture 
photometric catalog. We generated an optimal detection image from the B,V,R,&: I images using 
a process (Szalay et al. 1998). Briefly, this process involves convolving each input image with a 
Gaussian kernel matched to the seeing. The convolved images were squared and normalized so that 
they have zero mean and unit variance. The four processed images (corresponding to the original 
B,V,R,&L I images) were coadded, forming the detection image. 

In order to determine the optimal threshold parameters for source detection, we compared a 
histogram of the pixel distribution in the image with a x^ function with four degrees of freedom. 
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(This distribution theoretically corresponds to the background pixel distribution for our coadded 
image.) By taking the difference between the two histograms (pixel minus theory), wc generated 
the histogram of pixel values which were due to the objects that we were trying to detect. We 
defined the Bayesian detection threshold as the intersection of the sky (or theoretical prediction) 
and object pixel distributions {i.e. where the object pixel flux becomes dominant). To convert this 
empirical threshold for use with SExtractor, we scaled the threshold (which is a flux per pixel 
value) into a surface brightness threshold (which is in magnitudes per square arcsecond). 

The infrared images were cataloged separately due to their radically different fleld-of-view and 
pixel scale. The catalog was generated using SExtractor as before but in single image mode {i.e. the 
detection image was also the measurement image). In order to maximize the number of detections, 
we utilized the exposure image generated by DIMSUM during the data reduction (see §2.1.3) to 
provide a weight map in order to properly determine object parameters in an image with varying 
signal-to-noise (due to the different total integration times in different pixels). 

All of our resultant analyses use the total magnitudes calculated by SExtractor. The optical 
magnitudes were, by design, matched apertures which reduces the scatter in any photometric 
redshift technique. Since the K' band images were not co-registered with the optical images, the 
aperture can be slightly different. However, since the total apertures were rather large in angular 
extant, any "aperture effects" on our resulting analysis should be small (< 0.02"*) compared to the 
more dominant effects of photometric errors. 

2.3. Cross-identifications 

The last step in preparing these data for analysis was to combine the optical imaging, near- 
infrared imaging, and the spectroscopic observations into a single dataset for each cluster field. Since 
the original spectroscopic targets were determined from a separate catalog, we first cross-matched 
the spectroscopic catalog with the optical catalog using a growing annulus technique where the 
angular distance ip was determined using the formula 

ip = arccos(sin((/)s) sm{(pp) + cos(^s) cos{(pp) cos{6s — Op)) 

where the angles {6 = Declination, (j) = Right Ascension) have been properly converted into radians, 
and the subscript, s, refers to spectroscopic target and the subscript, p, refers to photometric object. 
In all cases, the match-up occurred for very small angles because spectroscopic targets were selected 
from the optical images. 

The cross-identification between the infrared and optical catalogs, however, was slightly more 
complicated, although it was done in an identical manner. This was due to the different telescopes 
and detectors used during the observations, and the corresponding geometric distortions present 
between the two catalogs. Furthermore, the variation of the spectral energy distributions of the 
sources in our sample between the / and K' bands can be quite large, especially for galaxies at 
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relatively high redshift. As a result, some objects in each catalog remained unmatched below the 
angular limit which is approximately the confusion beam width (approximately five arcseconds). 

3. Empirical Photometric Redshift 

The technique that we have developed for determining cluster membership requires redshift 
estimates for all galaxies in the field. Although the technique is independent of the actual redshift 
estimation technique, we have used the empirical photometric redshift technique (Brunner et al. 
1999) to generate redshift estimates for all galaxies in our sample. This is the result of the simplicity 
of the empirical technique in determining redshifts for a photometric sample of galaxies for which 
an adequate training set exists. Furthermore, the empirical technique, since it is based only on the 
data under investigation, is more robust than alternative techniques to uncertainties in the spectral 
energy distributions of galaxies. These alternative techniques, such as template-based methods, 
could also be used, as long as a realistic estimate of the error in the estimated redshift can be 
determined. 

The empirical photometric redshift technique requires a calibration sample to determine the 
coefficients of the empirical relation. For this work, we have used the spectroscopic data which 
was acquired as part of the original observational campaign to determine the cluster velocity dis- 
persions (Postman et al. 1998, 2000). The reliability (i.e. the inverse of the intrinsic error) of a 
spectroscopic redshift is generally quite high, with quoted errors often measured in the tenths of a 
percent (i.e. 5 < 0.001). In reality, however, a redshift is accurate only when the spectral identifica- 
tion is also accurate. As a result, we restricted the spectroscopic calibrators to include only objects 
with redshifts that are determined by at least two spectral lines, i.e. a quality of 2 or higher (for 
details, see Postman et al. 1998). The next step was to restrict the sample to those objects which 
were below the 10% photometric error limit, which did not have bad detection flags {e.g., object 
near edge of frame, incomplete aperture data), and which were below our high redshift cut-off of 
z = 1.5. We also removed one additional galaxy which, having an E + A spectrum (Dressier &: 
Gunn 1983, 1992; Zabludoff et al. 1996), skewed the results of our calibration for normal galaxy 
spectra (which are the dominant population in our sample). The final spectroscopic calibration 
data consisted of 130 galaxies from five different cluster fields. 

With the final calibration sample, we determined a fourth order polynomial fit to the four 
magnitudes B,R,I,Sz K' , and the spectroscopic redshift. Our sample of 130 calibrators is nearly 
twice the number of coefficients required (a fourth order polynomial in four variables has 70 co- 
efficients); we are, therefore, properly constraining the degrees of freedom when determining the 
coefficients of our polynomial fit. From Figure 1, there are no obvious systematic variations in our 
calculated relationship, which has an intrinsic dispersion (the standard deviation of the residual 
differences for the calibrating sample) of az = 0.072. In addition, the distribution of the residuals 
(see Figure 2) are approximately Gaussian in shape. 
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3.1. Estimating Photometric Redshift Errors 

In order to reliably determine cluster membership, we also need to have an estimate for the 
error in our photometric redshift calculation. While the measurement of the intrinsic dispersion in 
our relationship provides one estimate (which is commonly used throughout the relevant literature, 
e.g., Kodama et al. 1999), we have also developed an alternative, extrinsic estimate. Briefly, 
the intrinsic error is calculated directly from any available spectroscopic redshifts; that is, it is 
simply the dispersion in the derived relationship between spectroscopic and photometric redshifts. 
Extrinsic errors, on the other hand, are calculated in a Monte-Carlo fashion, accounting for all 
known measurement uncertainties. 

Following Brunner et al. (1999), we derive extrinsic redshift error estimates for all galaxies in 
our dataset using a Monte-Carlo technique to simulate our uncertainty in both the observables [i.e. 
flux measurements), as well as our sampling of the galaxy distribution in the B,R,I,Sz K' four- 
dimensional flux space. Specifically, we use a bootstrap with replacement algorithm to randomly 
draw a new set of calibrating galaxies from the original training set. This algorithm allows for 
duplicate galaxy entries and is designed to emphasize any incompleteness in the sampling of the 
true distribution of galaxies in the four dimensional space B,R,I,&i K' by the calibration redshifts. 
Furthermore, the magnitudes of the newly derived calibrating sample were drawn from a Gaussian 
probability distribution function with the mean given by the measured magnitude and sigma by 
the magnitude error. This accounts for any uncertainties in the flux measurements. 

This technique was used to generate one hundred different realizations of the photometric 
redshift relationship. For each different realization (all of which form our ensemble), a photometric 
redshift was calculated for every galaxy in the photometric catalog. As a result, this produces 
one hundred different redshift estimates for each galaxy. By applying a quantile cut, we can 
approximate the redshift distribution for each individual galaxy by a Gaussian. Therefore, the 
error in the photometric redshift for each object was defined by = (Q5 — Q2)/2.0, where Q5 and 
Q2 refer to the fifth and second quantiles of the distribution of each galaxy's estimated redshift, 
respectively (see Brunner et al. 1999). Together, our original photometric redshifts and redshift 
error estimates completely define the photometric redshift distribution for each individual galaxy. 

Of the two types of redshift error estimates, the intrinsic error is by far the most commonly 
used {e.g., Kodama et al. 1999), as it is the easiest to calculate. The extrinsic error, however, 
does a better job of quantifying the actual error in the redshift estimate of an individual galaxy, 
which can be both smaller or considerably larger than the intrinsically estimated global error (see 
Figure 15 in Brunner et al. 1999). Fundamentally, the extrinsic method provides an error estimate 
separately for each individual galaxy, while the intrinsic method provides only a global estimate. 
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4. Cluster Membership 



While the optimal technique for unambiguously determining cluster membership is to obtain 
spectroscopic redshifts for all sources in the field, this process can be prohibitively expensive in 
telescope time allocation, especially at relatively high redshifts {z > 0.5). As a result, many 
previous studies {e.g., Aragon-Salamanca et al. 1993; Lubin 1996; Small et al. 1997; Dressier et 
al. 1997; Couch et al. 1998; Stanford et al. 1998; Lubin et al. 1998) have utilized a background 
contamination estimator in order to statistically quantify both the number of galaxies which belong 
to a given cluster and the morphological fraction of the cluster members. In addition to possible 
systematic effects which are implicit in using techniques of this type, background corrections provide 
no information on the likelihood of an individual galaxy being a member of a cluster. 

Photometric redshifts, however, provide a relatively inexpensive redshift estimate; therefore, 
they can be used to estimate cluster redshifts {e.g., Gal et al. 2000) and to determine likely clus- 
ter members (Kodama et al. 1999; Lubin &: Brunncr 1999). Previous techniques to select cluster 
members based on photometric redshifts [e.g., Kodama et al. 1999) are based on a simplistic 
technique for defining galaxy cluster members. Basically, any galaxy which lies within a redshift 
shell centered on the galaxy cluster's redshift {i.e. Zc ± Sz) are identified as members. Because 
photometric redshifts have a large uncertainty (generally a factor of ten to fifty higher than spec- 
troscopic redshifts), a large redshift shell {e.g., Sz ~ 0.05) is used to identify likely cluster members. 
While certainly useful, this approach does not provide a reliable means for identifying line-of-sight 
contaminating galaxies. 

As a result, we have developed an alternative technique which relies on the probabilistic in- 
terpretation of a photometric redshift to determine cluster membership. We define the probability 
density function, ^{z), for an individual galaxy's redshift to be a Gaussian probability distribution 
function with mean (/x) given by the estimated photometric redshift and standard deviation {a) 
defined by the estimated error in the photometric redshift. 



where A?^ is a suitable normalization factor, Zc is the cluster redshift, Az is the width in redshift 




Using this interpretation, we can calculate the probability that a galaxy has an actual redshift 
within a given redshift interval. 
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Within this formalism, the only undeclared quantity is a or, alternatively, the uncertainty in 
the estimated redshift for a given galaxy. In our framework, this value can either be determined 
from the intrinsic error in our photometric redshift relation or in a separately calculated extrinsic 
error. As a result, this approach is strongly dependent on the actual technique used to estimate 
the photometric redshift error and corresponding redshift error. In this paper, we have applied this 
technique using empirically derived photometric redshifts since we have a well-defined calibration 
dataset of spectroscopic redshifts. Empirically defined photometric redshifts are much less sensitive 
to uncertainties in the shape and the evolution with redshift of the spectral energy distribution of 
galaxies than competing techniques such as template photometric redshifts (Brunner 1997). 

On the other hand, the method we have introduced can easily be adapted to work with al- 
ternative redshift estimation procedures, including template photometric redshift algorithms, by 
utilizing the derived dispersion estimate. The empirical-based techniques do have shortcomings; 
most notably, the requirement of a high quality training set and the fundamental limitation that 
any resultant analysis is limited to the same region of flux space delineated by the training set of 
calibration redshifts. In our case, the spectroscopic survey of Oke et al. (1998) provides a won- 
derful training set, which completely samples the region in flux space which encompasses the vast 
majority of cluster galaxies. When this is not the case, template based photometric redshift esti- 
mation methods, which can be applied to data where there are no calibrators, are required. In this 
case, however, extreme care must be used to minimize any systematic effects, such as morphological 
variation of the redshift error estimates or template incompleteness. 

By using the calibration galaxies as a training set, we can empirically determine the optimal 
probability value which provides the threshold for cluster membership. This can be tuned to either 
maximize the completeness or minimize contamination. In this paper, our sample is small enough 
that a simple visual threshold determination is sufficient; however, there is no reason that a more 
powerful maximum likelihood or regression analysis technique could not be employed, especially for 
larger datasets {e.g., the SDSS). Furthermore, additional information (such as evolutionary color 
tracks) can be used to select likely cluster members which can be used to provide a bootstrap 
estimator for the probability threshold. 

As a demonstration of this technique, we have applied this procedure to the data described 
earlier in this paper. We consider two clusters from our survey (see Section 2 for more information), 
the first is CI 0023-1-0423 at a redshift of ^ = 0.84 (Postman et al. 1998; Lubin, Postman & Oke 
1998; Lubin et al. 1998), and the second is CI 1604-1-4321 at a redshift of z = 0.92 (Postman et 
al. 2000; Lubin et al. 2000). These two clusters provide a useful demonstration because the first 
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cluster is well sampled by our dataset, while the second cluster pushes the photometric limits of 

our data. Since the utility of this technique is so strongly dependent on the exact method used 
for estimating the redshift and rcdshift error for every galaxy in the field, we have intentionally 
relegated a more complete discussion of the application of this technique to our own data to a later, 
more targeted paper (Lubin et al. 2000). 

In Figures 3 and 4, we display the results of applying this technique to the spectroscopic targets 
in the CI 0023+0423 cluster field, as well as the full photometric sample for this field. In both cases, 
we used the intrinsic error of the photometric redshift relationship as our redshift error. Prom the 
first panel, we clearly do quite well {i.e. 100% completeness with no contamination). Of course, 
translating this success to the full sample is not guaranteed; however, as long as our calibration 
sample adequately samples the four dimensional flux space occupied by the galaxies in our full 
sample {i.e. we have a fair and unbiased training set), there is no reason to expect significant 
variation from these results. 

In Figures 5 and 6, we show the results of applying this technique to the spectroscopic targets 
in the CI 1604+4321 cluster field, as well as the entire photometric sample for this cluster. This 
time, however, we have used the extrinsic error which is calculated independently for each galaxy. 
By judiciously selecting the probability threshold, we can maximize the completeness (which is 
~ 88%), while minimizing the contamination (which is ~ 21%). While we would clearly prefer full 
completeness and no contamination, this cluster demonstrates one of the strengths of our technique, 
namely the ability to empirically quantify both our completeness and contamination. 

In fact, our technique predicts 37 cluster members within the field of view of our combined 
image (see Figure 6). If we use the predictions of deep number counts {e.g., Driver et al. 1995; 
Small et al. 1995; Abraham et al. 1996) to estimate the contamination due to foreground and 
background galaxies, we estimate there should be 33 ± 14 cluster members in the image (note 
the large uncertainty in this statistical calculation), which is the exact number that we find after 
correcting for contamination and incompleteness [i.e. 37 x (1 — 21%)/88% = 33]. We have made this 
comparison with the statistical background correction simply in order to show that we are finding 
a reasonable number of cluster galaxies; however, unlike the background correction technique, we 
have explicitly identified the likely cluster members. 

5. Discussion 

In this paper, we have presented a novel technique for the probabilistic determination of cluster 
membership based solely on photometric redshift and redshift error estimates. Our technique can 
be tuned to either maximize the completeness of cluster member identification {i.e. identify all 
actual cluster members), or, alternatively, to minimize the contamination due to foreground and 
background galaxies in the field. Furthermore, this technique provides a robust estimation for 
both the completeness and contamination fractions of identified cluster members as a function of 
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likelihood. A comparison between our new technique with the traditional technique of statistical 
background correction shows remarkable agreement, with the caveat that our technique actually 
identifies the galaxies which are cluster members. 

The results in this paper are based on an empirical photometric redshift estimation, which we 
have used because (a) wc have a reliable training set from our extensive spectroscopic survey, and (b) 
these photometric redshifts are less sensitive to uncertainties in spectral evolution which can affect 
other redshift estimation techniques. Alternative redshift estimation techniques, however, can easily 
be incorporated into the algorithm, as all that is needed is a redshift estimate and a corresponding 
redshift error estimate. In the case of template photometric redshifts, the redshift error could be 
determined from either the intrinsic dispersion in the photometric redshift relation, an interpolation 
over the goodness-of-fit parameter, or even more appropriately through a combination of the 
parameter and a Monte-Carlo bootstrap approach to mimic photometric uncertainty and template 
incompleteness. In this case, however, caution must be used to minimize any possible systematic 
effects which might depend on various fundamental assumptions, such as the spectral type. 

The only free parameter in the entire technique is the probability threshold used to determine 
cluster membership. Optimally, as we have done in this paper, a training set of spectroscopically 
confirmed cluster members can be used to empirically set the threshold value. When this is not 
the case, alternative techniques, such as color-selection to identify the early-type sequence at the 
appropriate redshift, can be used to identify likely cluster candidates. On the other hand, the 
expected number of cluster candidates can be calculated from the statistical background correction 
arguments. This expected number of cluster members can be used as an independent estimator as 
we can set the probability threshold value to reproduce the estimated number of cluster galaxies. 

Based on our work, we believe that photometric redshifts are an ideal way to study the galaxy 
populations in high-redshift clusters. For a large number of the tests of common cluster properties 
{e.g., the Morphology-Density relationship), this method eliminates the need for extensive spectro- 
scopic surveys and the uncertainty of estimating the background contamination. As an additional 
example of the utility of this approach, consider the magnitude-limited, spectroscopic survey used 
in this analysis. We can approximate the total number of spectroscopic targets observed by mul- 
tiplying the number of clusters used in this analysis (five) by the approximate number of targets 
per cluster field (130), implying that a total of 650 objects were observed spectroscopically. The 
redshift identification success rate was nearly 77%. Of these redshifts, 50%-85% belong to non- 
cluster galaxies. These numbers imply that the spectroscopic survey was only 12%-38% efficient 
in identifying cluster candidates, depending on the richness of the cluster. 

In addition, our technique provides the ability to tune the cluster candidate identification to 
either maximize the completeness or, alternatively, to minimize the contamination. As a practical 
example, maximizing completeness is essential for any observational program which is interested 
in completely sampling the cluster (e.g., spectroscopic targeting). On the other hand, minimizing 
contamination is important for selecting a characteristic set of cluster members for a detailed 
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study of cluster member properties {e.g., the Morphological fraction or the Morphology-Density 
relationship; Lubin et al. 2000). Furthermore, if a training set of spectroscopically confirmed 
galaxies is available, we can actually characterize the percentages of both the completeness and 
contamination that wc expect in our final cluster candidate catalog. 

Like any photometric rcdshift based technique, our technique is limited in its efficacy by the 
quality of the photometric data used in the analysis. However, our technique can be applied using a 
variety of photometric redshift estimation methods. Therefore, we expect that, in the near future, 
we will be able to fully utilize the capabilities of this technique for cluster research. 
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Fig. 1. — The correlation between the photometric and spectroscopic redshifts for the entire cah- 
bration sample. The straight line is of unit slope and is not a fit to the actual data. 
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Fig. 2. — A histogram of the residual differences between photometric and spectroscopic redshifts 
for the entire caUbration sample, which is approximately Gaussian in shape. The bin width was 
chosen to approximate the measured intrinsic dispersion in our relationship. 
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Fig. 3. — Results for the field of CI 0023+0423. A plot displaying the locations of calibrating 
galaxies in the image (the pixel scale is 0.215 arcsec/pixel). The spectroscopically confirmed cluster 
members are indicated by o, photometrically classified cluster members are indicated by x, and + 
indicates field galaxies which are additional calibrators for the photometric redshift relation. With 
the probability cut selected for this figure, we have 100% completeness and 0% contamination. 
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Fig. 4. — Results for the field of CI 0023+0423. A demonstration of the application of this technique 
to the full photometric catalog. Cluster candidates are indicated as before (both spectroscopic 
and photometric), while new cluster candidates are indicated by + and general field galaxies are 
indicated by the dots. After correcting for contamination and incompleteness, the number of cluster 
members determined through our photometric redshift technique agrees with the number predicted 
by statistically correcting for background counts. 
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Fig. 5. — Results for the field of CI 1604+4321. A plot displaying the locations of calibrating 
galaxies in the image (the pixel scale is 0.215 arcscc/pixel). The spectroscopically confirmed cluster 
members are indicated by o, photometrically classified cluster members are indicated by x, and + 
indicates field galaxies which are additional calibrators for the photometric redshift relation. With 
the probability cut selected for this figure, we have 88% completeness and ^ 21% contamination. 
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Fig. 6. — Results for the field of CI 1604+4321. A demonstration of the application of this technique 
to the full photometric catalog. Cluster candidates are indicated as before (both spectroscopic 
and photometric), while new cluster candidates are indicated by + and general field galaxies are 
indicated by the dots. After correcting for contamination and incompleteness, the number of cluster 
members determined through our photometric redshift technique agrees with the number predicted 
by statistically correcting for background counts. 



